Abstract

The singular value decomposition (SVD) is a popular matrix factor- 
ization that has been used widely in applications ever since an efficient 
algorithm for its computation was developed in the 1970s. In recent years, 
the SVD has become even more prominent due to a surge in applications 
and increased computational memory and speed. 

To illustrate the vitality of the SVD in data analysis, we highlight three 
of its lesser-known yet fascinating applications: the SVD can be used 
to characterize political positions of Congressmen, measure the growth 
rate of crystals in igneous rock, and examine entanglement in quantum 
computation. We also discuss higher-dimensional generalizations of the 
SVD, which have become increasingly crucial with the newfound wealth 
of multidimensional data and have launched new research initiatives in 
both theoretical and applied mathematics. With its bountiful theory and 
applications, the SVD is truly extraordinary. 



1 In the Beginning, There is the SVD. 

Let's start with one of our favorite theorems from linear algebra and what is 
perhaps the most important theorem in this paper. 

Theorem 1 Any matrix $A \in \mathbb{R}^{m \times n}$ can be factored into a singular value
decomposition (SVD),

$$A = U S V^T, \qquad (1)$$

where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices (i.e., $UU^T = VV^T = I$),
and $S \in \mathbb{R}^{m \times n}$ is diagonal with $r = \mathrm{rank}(A)$ leading positive diagonal entries.
The $p$ diagonal entries of $S$ are usually denoted by $\sigma_i$ for $i = 1, \ldots, p$, where
$p = \min\{m, n\}$, and $\sigma_i$ are called the singular values of $A$. The nonzero singular
values are the square roots of the nonzero eigenvalues of both $AA^T$ and $A^TA$, and
the singular values satisfy the property $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p$.

See Ref. [66] for a proof. 
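
To make Theorem 1 concrete, here is a minimal numerical sketch in Python. It assumes that NumPy is available; the random test matrix and all variable names are our own illustrative choices. It builds the full factorization and computes the quantities needed to check the claims of the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))          # an arbitrary (hypothetical) test matrix

# full_matrices=True returns U (m x m) and V^T (n x n), as in Theorem 1.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the p = min(m, n) singular values in an m x n diagonal matrix S.
S = np.zeros((m, n))
S[:len(s), :len(s)] = np.diag(s)

# Residual of the factorization A = U S V^T.
reconstruction_error = np.linalg.norm(A - U @ S @ Vt)

# Eigenvalues of A^T A in descending order; their square roots are the sigma_i.
eigs = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
```

Here `s` holds the singular values in nonincreasing order, and `eigs` can be compared with `s**2` to check the eigenvalue relationship.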

Equation (1) can also be written as a sum of rank-1 matrices,

$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^T, \qquad (2)$$

where $\sigma_i$ is the $i$th singular value, and $u_i$ and $v_i$ are the $i$th columns of $U$ and
$V$.

Equation (2) is useful when one wants to estimate $A$ using a matrix of lower
rank [23].

Theorem 2 (Eckart-Young) Let the SVD of $A$ be given by (1). If $k < r = \mathrm{rank}(A)$
and $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$, then

$$\min_{\mathrm{rank}(B) = k} \|A - B\|_2 = \|A - A_k\|_2 = \sigma_{k+1}. \qquad (3)$$

See Ref. [27] for a proof. 
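
The following sketch illustrates Theorem 2 numerically. It is only an illustration under our own assumptions (a random test matrix, NumPy, and a helper that we call `truncate`): it forms the truncation $A_k$, measures its spectral-norm error, and compares that error with the error of one arbitrary rank-$k$ competitor.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def truncate(U, s, Vt, k):
    """Return A_k, the sum of the first k rank-1 terms sigma_i u_i v_i^T."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

k = 2
A_k = truncate(U, s, Vt, k)
err_svd = np.linalg.norm(A - A_k, 2)      # spectral norm of the residual

# A randomly chosen rank-k matrix B, for comparison only.
B = rng.standard_normal((8, k)) @ rng.standard_normal((k, 6))
err_random = np.linalg.norm(A - B, 2)
```

By the theorem, `err_svd` should agree with `s[k]` (that is, $\sigma_{k+1}$ in zero-based indexing), and no rank-$k$ matrix `B` can do better.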

The SVD was discovered over 100 years ago independently by Eugenio
Beltrami (1835-1899) and Camille Jordan (1838-1921) [55]. James Joseph
Sylvester (1814-1897), Erhard Schmidt (1876-1959), and Hermann Weyl
(1885-1955) also discovered the SVD using different methods [55]. The development in
the 1960s of practical methods for computing the SVD transformed the field of
numerical linear algebra. One method of particular note is the Golub and Reinsch
algorithm from 1970 [25]. See Ref. Q3] for an overview of properties of the
SVD and methods for its computation. See the documentation for the Linear
Algebra Package (LAPACK) [5] for details on current algorithms to calculate
the SVD for dense, structured, or sparse matrices.

Since the 1970s, the SVD has been used in an overwhelming number of
applications. The SVD is now a standard topic in many first-year applied
mathematics graduate courses and occasionally appears in the undergraduate
curriculum. Theorem 2 is one of the most important features of the SVD, as it is
extremely useful in least-squares approximations and principal component
analysis (PCA). During the last decade, the theory, computation, and application of
higher-dimensional versions of the SVD (which are based on Theorem 2) have
also become extremely popular among applications with multidimensional data.
We include a brief description of a higher-dimensional SVD in this article, and
invite you to peruse Ref. [36] and references therein for additional details.

We will not attempt in this article to summarize the hundreds of applications 
that use the SVD, and our discussions and reference list should not be viewed 
as even remotely comprehensive. Our goal is to summarize a few examples of 
recent lesser-known applications of the SVD that we enjoy in order to give a 
flavor of the diversity and power of the SVD, but there are a myriad of others. 
We mention some of these in passing in the next section, and we then focus 
on examples from Congressional politics, crystallization in igneous rocks, and 
quantum information theory. We also discuss generalizations of the SVD before 
ending with a brief summary. 

2 It's Raining SVDs (Hallelujah)!



The SVD constitutes one of science's superheroes in the fight against monstrous 
data, and it arises in seemingly every scientific discipline. 

One finds the SVD in statistics in the guise of "principal component
analysis" (PCA), which entails computing the SVD of a data set after centering
the data for each attribute around the mean. Many other methods of
multivariate analysis, such as factor and cluster analysis, have also proven to be
invaluable [H|. The SVD per se has been used in chemical physics to obtain
approximate solutions to the coupled-cluster equations, which provide one of the
most popular tools used for electronic structure calculations [34]. Additionally,
one applies an SVD when diagonalizing the one-particle reduced density matrix
to obtain the natural orbitals (i.e., the singular vectors) and their occupation
numbers (i.e., the singular values). The SVD has also been used in numerous
image-processing applications, such as in the calculation of Eigenfaces to
provide an efficient representation of facial images in face recognition [49, 68, 69].
It is also important for theoretical endeavors, such as path-following methods
for computing curves of equilibria in dynamical systems [22]. The SVD has also
been applied in genomics [SHU], textual database searching [TT], robotics [Jj,
financial mathematics [25], compressed sensing [74], and more.
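
As a small illustration of the PCA recipe mentioned at the start of this paragraph (center each attribute, then take the SVD), consider the following sketch. The data set is synthetic and entirely hypothetical, and the variable names are ours; we assume only NumPy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 200 samples (rows) of 3 strongly correlated attributes (columns).
latent = rng.standard_normal((200, 1))
X = latent @ np.array([[2.0, 1.0, -1.0]]) + 0.1 * rng.standard_normal((200, 3))

Xc = X - X.mean(axis=0)                        # center each attribute around its mean
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                                # rows are the principal directions
explained_variance = s**2 / (X.shape[0] - 1)   # variance captured by each direction
scores = Xc @ Vt.T                             # coordinates of the samples (equal to U * s)
```

With this normalization, `explained_variance` coincides with the eigenvalues of the sample covariance matrix of `X`, which is the usual eigenvalue-based description of PCA.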

Computing the SVD is expensive for large matrices, but there are now
algorithms that offer significant speed-up (see, for example, Refs. [10], [39]) as well
as randomized algorithms to compute the SVD [40]. The SVD is also the basic
structure for higher-dimensional factorizations that are SVD-like in nature [36];
this has transformed computational multilinear algebra over the last decade.
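
To give a flavor of the randomized approach, here is a sketch of one common scheme: sample the range of the matrix with a random test matrix, and then take the SVD of a much smaller projected matrix. We do not claim that this is the specific algorithm of Ref. [40]. The function name `randomized_svd`, the oversampling parameter, and the test matrix are our own assumptions.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD via a random range finder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)        # orthonormal basis for the sampled range
    B = Q.T @ A                           # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(3)
A = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 300))   # rank 8 by construction
U, s, Vt = randomized_svd(A, k=8)
s_exact = np.linalg.svd(A, compute_uv=False)[:8]
```

For a matrix that has exactly low rank, as in this example, the sampled range captures the true range and the approximation is accurate to rounding error. For matrices whose singular values decay slowly, the accuracy depends on the oversampling.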

3 Congressmen on a Plane. 

In this section, we use the SVD to discuss voting similarities among politicians.
In this discussion, we summarize work from Refs. [56, 57], which utilize the SVD
but focus predominantly on other items.

Mark Twain wrote in Pudd'nhead Wilson's New Calendar that "It could 
probably be shown by facts and figures that there is no distinctly American 
criminal class except Congress" [70]. There are aspects of this snarky comment
that are actually pretty accurate, as much of the detailed work in making United 
States law is performed by Congressional committees and subcommittees. (This 
differs markedly from parliamentary democracies such as Great Britain and 
Canada.) 

There are many ways to characterize the political positions of Congressmen. 
An objective approach is to apply data-mining techniques such as the SVD 
(or other "multidimensional scaling" methods) on matrices determined by the 
Congressional roll call. Such ideas have been used successfully for decades by 
political scientists such as Keith Poole of UC San Diego and Howard Rosenthal of 
Princeton University [54, 55]. One question to ask, though, is what observations
can be made using just the SVD. 

In Refs. [55, 57], the SVD was employed to investigate the ideologies of
Members of Congress. Consider each two-year Congress as a separate data set and
also treat the Senate and House of Representatives separately. Define an $m \times n$
voting matrix $A$ with one row for each of the $m$ legislators and one column for
each of the $n$ bills on which legislators voted. The element $A_{ij}$ has the value $+1$
if legislator $i$ voted "yea" on bill $j$ and $-1$ if he or she voted "nay." The sign of
a matrix element has no bearing a priori on conservatism versus liberalism, as
the vote in question depends on the specific bill under consideration. If a
legislator did not vote because of absence or abstention, the corresponding element
is 0. Additionally, a small number of false zero entries result from resignations
and midterm replacements.

Taking the SVD of $A$ allows one to identify Congressmen who voted the
same way on many bills. Suppose the SVD of $A$ is given by (2). The grouping
that has the largest mean-square overlap with the actual groups voting for or
against each bill is given by the first left singular vector $u_1$ of the matrix, the
next largest by the second left singular vector $u_2$, and so on. Truncating $A$
by keeping only the first $k < r$ nonzero singular values gives the approximate
voting matrix

$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T \approx A. \qquad (4)$$

This is a "$k$-mode truncation" (or "$k$-mode projection") of the matrix $A$. By
Theorem 2, (4) is a good approximation as long as the singular values decay
sufficiently rapidly with increasing $i$.
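
The truncation can be sketched in a few lines of Python. Everything about the data here is an assumption on our part: we do not have the roll-call records at hand, so we generate a synthetic voting matrix from a simple two-factor model (a hypothetical `ideology` per legislator, and a hypothetical `bill_slant` and `bill_appeal` per bill) with a few zeros for absences.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 100, 400                                    # legislators, bills (synthetic)
party = np.where(np.arange(m) < 50, -1.0, 1.0)     # two hypothetical blocs
ideology = party + 0.3 * rng.standard_normal(m)
bill_slant = rng.standard_normal(n)                # how partisan each bill is
bill_appeal = 0.5 * rng.standard_normal(n)         # how broadly popular each bill is

# Yea (+1) and nay (-1) votes from the two-factor model plus noise.
A = np.sign(np.outer(ideology, bill_slant) + bill_appeal
            + 0.5 * rng.standard_normal((m, n)))
A[rng.random((m, n)) < 0.03] = 0.0                 # absences and abstentions

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]               # the k-mode truncation
coords = U[:, :k] * s[:k]                          # two coordinates per legislator
rel_err = np.linalg.norm(A - A_k, 2) / np.linalg.norm(A, 2)
```

The rows of `coords` play the role of the legislators' positions along the two leading singular directions, and `rel_err` equals $\sigma_3 / \sigma_1$ by Theorem 2.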

A Congressman's voting record can be characterized by just two
coordinates [56, 57], so the two-mode truncation $A_2$ is an excellent approximation to
$A$. One of the two directions (the "partisan" coordinate) correlates well with
party affiliation for members of the two major parties. The other direction (the
"bipartisan" coordinate) correlates well with how often a Congressman votes
with the majority.$^1$ We show the coordinates along these first two singular
vectors for the 107th Senate (2001-2002) in Fig. 1(a). As expected, Democrats
(on the left) are grouped together and are almost completely separated from
Republicans (on the right).$^2$ The few instances of party misidentification are
unsurprising; conservative Democrats such as Zell Miller [D-GA] appear
farther to the right than some moderate Republicans [12]. Senator James Jeffords
[I-VT], who left the Republican party to become an Independent early in the
107th Congress, appears closer to the Democratic group than the Republican
one and to the left of several of the more conservative Democrats.$^3$

1 For most Congresses, it suffices to use a two-mode truncation. For a few, it is desirable
to keep a third singular vector, which can be used to try to encapsulate a North-South divide.

2 Strictly speaking, the partisanship singular vector is determined up to a sign, which is
then chosen to yield the usual Left/Right convention.

3 Jeffords appears twice in Fig. 1(a), once each for votes cast under his two different
affiliations.

Figure 1: Singular value decomposition (SVD) of the Senate voting record from
the 107th U.S. Congress (2001-2002). (a) Two-mode truncation $A_2$ of the
voting matrix $A$. Each point represents a projection of a single Representative's
votes onto the leading two eigenvectors (labeled "partisan" and "bipartisan,"
as explained in the text). Democrats (light dots) appear on the left and
Republicans (medium dots) are on the right. The two Independents are shown
using dark dots. (b) "Predictability" of votes cast by Senators in the 107th
Congress based on a two-mode truncation of the SVD. Individual Senators range
from 74% predictable to 97% predictable. These figures are modified versions
of figures that appeared in Ref. [55].

Figure 2: SVD of the roll call of the 107th House of Representatives projected
onto the voting coordinates. There is a clear separation between bills that passed
(dark dots) and those that did not (light dots). The four corners of the plot are
interpreted as follows: bills with broad bipartisan support (north) all passed;
those supported mostly by the Right (east) passed because the Republicans
were the majority party; bills supported by the Left (west) failed because of the
Democratic minority; and the (obviously) very few bills supported by almost
nobody (south) also failed. This figure is a modified version of a figure that
appeared in Ref. [56].

Equation (4) can also be used to construct an approximation to the votes in
the full roll call. Again using $A_2$, one assigns "yea" or "nay" votes to
Congressmen based on the signs of the matrix elements. Figure 1(b) shows the fraction
of actual votes correctly reconstructed using this approximation. Looking at
whose votes are easier to reconstruct gives a measure of the "predictability" of
the Senators in the 107th Congress. Unsurprisingly, moderate Senators are less
predictable than hard-liners for both parties. Indeed, the two-mode truncation
correctly reconstructs the votes of some hard-line Senators for as many as 97%
of the votes that they cast.
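
The "predictability" calculation described in this paragraph can be sketched as follows. The function name `predictability` and the synthetic voting matrix are our own assumptions; only the recipe (take the signs of the two-mode truncation and compare them with the votes that were actually cast) comes from the discussion above.

```python
import numpy as np

def predictability(A, k=2):
    """Fraction of each legislator's cast votes that sign(A_k) reproduces."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    cast = A != 0                               # ignore absences and abstentions
    correct = (np.sign(A_k) == A) & cast
    return correct.sum(axis=1) / cast.sum(axis=1)

# Synthetic stand-in for a roll call (hypothetical two-factor model).
rng = np.random.default_rng(8)
m, n = 60, 300
ideology = rng.uniform(-1, 1, m)
A = np.sign(np.outer(ideology, rng.standard_normal(n))
            + 0.5 * rng.standard_normal(n)      # per-bill appeal, shared by all rows
            + 0.3 * rng.standard_normal((m, n)))
A[rng.random((m, n)) < 0.05] = 0.0

p = predictability(A)                           # one value in [0, 1] per legislator
```

Sorting `p` would give the analogue of Fig. 1(b) for the synthetic data. We make no claim that the synthetic numbers resemble those of the 107th Senate.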

To measure the reproducibility of individual votes and outcomes, the SVD
can be used to calculate the positions of the votes along the partisanship and
bipartisanship coordinates (see Fig. 2). One obtains a score for each vote by
reconstituting the voting matrix as before using the two-mode truncation $A_2$
and summing the elements of the approximate voting matrix over all legislators.
Making a simple assignment of "pass" to those votes that have a positive score
and "fail" to all others successfully reconstructs the outcome of 984 of the 990
total votes (about 99.4%) in the 107th House of Representatives. A total of 735
bills passed, so simply guessing that every vote passed would be considerably
less effective (it would be correct for only about 74.2% of the votes). This way of
counting the success in reconstructing the outcomes of
votes is the most optimistic one. Ignoring the values from known absences and
abstentions, 975 of the 990 outcomes are still identified correctly. Even the most
conservative measure of the reconstruction success rate (in which one ignores
values associated with abstentions and absences, assigns individual yeas or nays
according to the signs of the elements of $A_2$, and then observes which outcome
has a majority in the resulting roll call) identifies 939 (about 94.8%) of the
outcomes correctly. The success rates for other recent Houses are similar [5B].
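
Here is a sketch of the most optimistic of these scoring rules: sum the two-mode truncation over legislators and call a vote a "pass" when the score is positive. The synthetic roll call, the function name `reconstruct_outcomes`, and the use of a simple majority of the synthetic votes as the "actual" outcome are all assumptions that we make for illustration.

```python
import numpy as np

def reconstruct_outcomes(A, k=2):
    """Predict pass (True) or fail (False) for each vote from the k-mode truncation."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    score = A_k.sum(axis=0)                 # sum over legislators for each vote
    return score > 0

# Synthetic roll call with an odd number of legislators and no absences,
# so that a simple majority is never tied.
rng = np.random.default_rng(9)
m, n = 101, 400
ideology = np.where(np.arange(m) < 50, -1.0, 1.0) + 0.3 * rng.standard_normal(m)
A = np.sign(np.outer(ideology, rng.standard_normal(n))
            + rng.standard_normal(n)
            + 0.5 * rng.standard_normal((m, n)))

actual = A.sum(axis=0) > 0                  # assumed rule: simple majority passes
predicted = reconstruct_outcomes(A, k=2)
agreement = np.mean(predicted == actual)    # fraction of outcomes reproduced
```

The stricter counting rules in the text differ only in how the entries of $A_2$ are processed (for example, by taking signs and masking absences) before one tallies each vote.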

To conclude this section, we remark that it seems to be underappreciated
that many political scientists are extremely sophisticated in their use of
mathematical and statistical tools. Although the calculations that we discussed above
are heuristic ones, several mathematicians and statisticians have put a lot of
effort into using mathematically rigorous methods to study problems in political
science. For example, Donald Saari has done a tremendous amount of work on
voting methods jB(J|, and (closer to the theme of this article) rigorous arguments
from multidimensional scaling have recently been used to study roll-call voting
in the House of Representatives [H].

4 The SVD is Marvelous for Crystals. 

Igneous rock is formed by the cooling and crystallization of magma. One inter- 
esting aspect of the formation of igneous rock is that the microstructure of the 
rock is composed of interlocking crystals of irregular shapes. The microstructure 
contains a plethora of quantitative information about the crystallization of deep 
crust — including the nucleation and growth rate of crystals. In particular, the 
three-dimensional (3D) crystal size distribution (CSD) provides a key piece of 
information in the study of crystallization rates. CSD can be used, for example, 
to determine the ratio of nucleation rate to growth rate. Both rates are slow in 
the deep crust, but the growth rate dominates the nucleation rate. This results 
in a microstructure composed of large crystals. See Ref. [59] for more detail
on measuring growth rates of crystals and Refs. [30, 42] for more detail on this
application of the SVD.

As the crystals in a microstructure become larger, they compete for growth 
space and their grain shapes become irregular. This makes it difficult to measure 
grain sizes accurately. CSD analysis of rocks is currently done in two stages. 
First, one takes hand measurements of grain sizes in 2D slices and then computes 
statistical and stereological corrections to the measurements in order to estimate 
the actual 3D CSD. However, a novel recent approach allows one to use the SVD 
to automatically and directly measure 3D grain sizes that are derived from 
three specific crystal shapes (prism, plate, and cuboid; see Fig. 3) [4]. Ongoing
research involves extending such analysis to more complex and irregular shapes. 
Application to real rock microstructures awaits progress in high energy X-ray 
tomography, as this will allow improved resolution of grain shapes. 

Figure 3: Crystalline structures used to measure grain sizes: (a) tetragonal
prism (1:1:5), (b) tetragonal plate (1:5:5), and (c) orthorhombic cuboid (1:3:5).
We give the relative sizes of their dimensions in parentheses.

The grain sizes are determined by generating databases of microstructures 
with irregular grain shapes in order to compare the estimated CSD of the actual 
grains to the computed or ideal CSD predicted by the governing equations. 
Because the CSDs in many igneous rocks are close to linear [3111], the problem 
can be simplified by using governing equations that generate linear CSDs with 
the following two rate laws. 

1. Nucleation Rate Law: $N(t) = e^{\alpha t}$, where $N$ is the number of new nuclei
formed at each time step $t$ and $\alpha$ is the nucleation constant.

2. Crystal Growth Rate Law: $G = \Delta L / \Delta t$, where $\Delta L / \Delta t$ is the rate of
change of a grain diameter per time step. Grain sizes can be represented
by short, intermediate, or long diameters. Such diameter classification
depends on the relationship between the rate of grain nucleation and the
rate of grain growth.

One uses an ellipsoid to approximate the size and shape of each grain. There 
are multiple subjective choices for such ellipsoids that depend on the amount 
(i.e., the number of points) of the grain to be enclosed by the ellipsoid. To 
circumvent this subjectivity, it is desirable to compare the results of three types 
of ellipsoids: the ellipsoid that encloses the entire grain, the ellipsoid that is 
inscribed within the grain, and the mean of the enclosed and inscribed ellipsoids. 
See Fig. 4 for an illustration of an enclosing and an inscribed ellipsoid.

Figure 4: Two possible ellipsoids used to approximate grain sizes; panel (b)
shows the inscribed ellipsoid. Because grain shapes are irregular, all ellipsoids
are triaxial with three unequal diameters.

The SVD is used in the determination of each of the three types of ellipsoids.
Comparing the CSDs obtained using each of the three types of ellipsoids with
those predicted by the governing equations reveals that the inscribed ellipsoids
give the best results. In particular, one can use an algorithm developed by Nima
Moshtagh [47] that employs the Khachiyan Algorithm [8] along with the SVD to
obtain an ellipsoid that encloses an arbitrary number of points (which is defined
by the user). Leonid Khachiyan introduced the ellipsoid method in 1979, and
this was the first algorithm for linear programming that runs in polynomial time
in the worst case. Given a matrix of data points $P$ containing a discretized set
of 3D points representing the crystal, one solves

$$\min_{A, c} \log\{\det(A^{-1})\} \quad \text{subject to} \quad (P_i - c)^T A (P_i - c) \leq 1, \qquad (5)$$

where $P_i$ is the $i$th data point in $P$, the matrix $A$ contains information about the
shape of the ellipsoid, and $c$ is the center of the ellipsoid. Minimizing
$\log\{\det(A^{-1})\}$ minimizes the volume of the ellipsoid, which is proportional to
$\det(A)^{-1/2}$.

Note that $P$ in this case is dense, it has size $n \times 3$, and $n \approx 5000$. Once $A$
and $c$ have been determined, one calculates the $i$th radius of the $D$-dimensional
ellipse from the SVD of $A$ using

$$r_i = 1 / \sqrt{\sigma_i}, \qquad (6)$$

where $\sigma_i$ ($i = 1, \ldots, D$) is the $i$th singular value of $A$. If the SVD of $A$ is given
by equation (1), then the orientation of the ellipsoid is given by the rotation
matrix $V$.

The major difficulty in such studies of igneous rock is that grain shapes
and sizes are irregular due to competition for growth space among crystals.
In particular, they are not of the ideal sizes and shapes that are assumed by
crystallization theory. For example, crystals might start to grow with definite
diameter ratios (yielding, for example, the prism, plate, or cuboid in Fig. 3) but
eventually develop irregular outlines. Current studies [4] suggest that one of the
diameters or radii of the inscribed ellipsoid (as determined from the SVD) can be
used as a measure of grain size for the investigation of crystal size distributions,
but the problem remains open.
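
The enclosing-ellipsoid computation described in this section can be sketched as follows. This is our own minimal Python rendering of a Khachiyan-type iteration for the minimum-volume enclosing ellipsoid, and it is not the code of Ref. [47]. The function name, the tolerance, and the test points (the corners and interior of a 1:3:5 cuboid) are assumptions. The routine returns the shape matrix $A$ and center $c$ of an ellipsoid $(x - c)^T A (x - c) \leq 1$, and the radii then follow from the singular values of $A$.

```python
import numpy as np

def min_volume_enclosing_ellipsoid(P, tol=1e-4, max_iter=100_000):
    """Approximate minimum-volume ellipsoid (x - c)^T A (x - c) <= 1 around the rows of P."""
    n, d = P.shape
    Q = np.hstack([P, np.ones((n, 1))]).T       # lifted points, shape (d + 1) x n
    u = np.full(n, 1.0 / n)                     # weights on the points
    for _ in range(max_iter):
        X = (Q * u) @ Q.T                       # sum_i u_i q_i q_i^T
        M = np.einsum("ij,ij->j", Q, np.linalg.solve(X, Q))   # q_i^T X^{-1} q_i
        j = np.argmax(M)                        # point that is worst covered
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break
    c = P.T @ u                                 # center of the ellipsoid
    A = np.linalg.inv((P.T * u) @ P - np.outer(c, c)) / d
    return A, c

# Hypothetical "grain": corners and interior points of a 1:3:5 cuboid.
rng = np.random.default_rng(5)
half = np.array([1.0, 3.0, 5.0])
corners = np.array([[sx, sy, sz] for sx in (-1, 1)
                    for sy in (-1, 1) for sz in (-1, 1)]) * half
interior = rng.uniform(-1, 1, size=(200, 3)) * half
P = np.vstack([corners, interior])

A, c = min_volume_enclosing_ellipsoid(P)
U, s, Vt = np.linalg.svd(A)
radii = 1.0 / np.sqrt(s)                        # r_i = 1 / sqrt(sigma_i)
```

Because the iteration stops at a finite tolerance, the points satisfy the constraint only up to a small relative slack. The columns of `Vt.T` give the orientation of the ellipsoid.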

5 Quantum Information Society.



From a physical perspective, information is encoded in the state of a physical
system, and a computation is carried out on a physically realizable device [58].
Quantum information refers to information that is held in the state of a quantum
system. Research in quantum computation and quantum information theory has
helped lead to a revival of interest in linear algebra by physicists. In these
studies, the SVD (especially in the form of the Schmidt decomposition) has been
crucial for gaining a better understanding of fundamental quantum-mechanical
notions such as entanglement and measurement.

Entanglement is a quantum form of correlation that is much stronger than 
classical correlation, and quantum information scientists use entanglement as a 
basic resource in the design of quantum algorithms [58]. The potential power of
quantum computation relies predominantly on the inseparability of multipartite 
quantum states, and the extent of such interlocking can be measured using 
entanglement. 

We include only a brief discussion in the present article, but one can go
much farther [55, 58, 62]. Whenever there are two distinguishable particles, one
can fully characterize inseparable quantum correlations using what is known as
a "single-particle reduced density matrix" (see the definition below), and the
SVD is crucial for demonstrating that this is the case. See Refs. [53, 58, 62] for
lots of details and all of the quantum mechanics notation that you'll ever desire.

Suppose that one has two distinguishable particles A and B. One can then
write a joint pure-state wave function $|\Psi\rangle$, which is expressed as an expansion
in its states weighted by the probability that they occur. Note that we have
written the wave function using Dirac (bra-ket) notation. It is a column vector,
and its Hermitian conjugate is the row vector $\langle\Psi|$. The prefactor for each term
in the expansion of $|\Psi\rangle$ consists of the complex-valued components of an
$m \times n$ probability matrix $C$, which satisfies $\mathrm{tr}(CC^\dagger) = \mathrm{tr}(C^\dagger C) = 1$. (Recall
that $X^\dagger$ refers to the Hermitian conjugate of the matrix $X$.)

Applying the SVD of $C$ (i.e., letting $C = USV^\dagger$, where $U$ and $V$ are unitary
matrices$^4$) and transforming to a single-particle basis allows one to diagonalize
$|\Psi\rangle$, which is said to be entangled if more than one singular value is nonzero.
One can even measure the entanglement using the two-particle density matrix
$\rho := |\Psi\rangle\langle\Psi|$ that is given by the outer product of the wave function with itself.

One can then compute the von Neumann entanglement entropy

$$\sigma = -\sum_{k=1}^{\min(n,m)} |S_k^2| \ln |S_k^2|. \qquad (7)$$

Because $|S_k^2| \in [0, 1]$, the entropy is zero for unentangled states and has the
value $\ln[\min(n, m)]$ for maximally entangled states.
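
The entropy calculation is short enough to sketch directly. In the following Python snippet, the function name `entanglement_entropy` and the two example states are our own choices; the snippet normalizes the coefficient matrix $C$, takes its singular values, and evaluates the sum in (7) with the usual convention $0 \ln 0 = 0$.

```python
import numpy as np

def entanglement_entropy(C):
    """Von Neumann entanglement entropy computed from the coefficient matrix C."""
    C = np.asarray(C, dtype=complex)
    C = C / np.linalg.norm(C)                  # enforce tr(C C^dagger) = 1
    S = np.linalg.svd(C, compute_uv=False)     # singular values (Schmidt coefficients)
    p = S**2
    p = p[p > 1e-15]                           # convention: 0 ln 0 = 0
    return float(-np.sum(p * np.log(p)))

# Two-qubit examples (m = n = 2), chosen for illustration.
product_state = np.outer([1, 0], [1, 1]) / np.sqrt(2)   # a separable state
bell_state = np.eye(2) / np.sqrt(2)                     # a maximally entangled state

entropy_product = entanglement_entropy(product_state)
entropy_bell = entanglement_entropy(bell_state)
```

The separable state has a single nonzero singular value, so its entropy vanishes, whereas the second state has two equal singular values and attains the maximum $\ln 2$.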

4 A unitary matrix $U$ satisfies $UU^\dagger = 1$ and is the complex-valued generalization of an
orthogonal matrix.

The SVD is also important in other aspects of quantum information. For
example, it can be used to help construct measurements that are optimized to
distinguish between a set of (possibly nonorthogonal) quantum states [21].

6 Can You Take Me Higher? 

As we have discussed, the SVD permeates numerous applications and is vital 
to data analysis. Moreover, with the availability of cheap memory and ad- 
vances in instrumentation and technology, it is now possible to collect and store 
enormous quantities of data for science, medical, and engineering applications. 
A byproduct of this wealth is an ever-increasing abundance of data that is 
fundamentally three-dimensional or higher. The information is thus stored in 
multiway arrays (i.e., as tensors) instead of as matrices. An order-$p$ tensor $\mathcal{A}$
is a multiway array with $p$ indices:

$$\mathcal{A} = (a_{i_1 i_2 \ldots i_p}) \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_p}.$$

Thus, a first-order tensor is a vector, a second-order tensor is a matrix, a
third-order tensor is a "cube", and so on. See Fig. 5 for an illustration of a $2 \times 2 \times 2$
tensor.




Figure 5: Illustration of a 2 x 2 x 2 tensor as a cube of data. This figure originally 
appeared in Ref. [33] and is used with permission from Elsevier. 



Applications involving operations with tensors are now widespread. They
include chemometrics [64], psychometrics [37], signal processing [l"5ll 171163],
computer vision [TOf73"], data mining [TIIBT], networks [3"5H8], neuroscience [B145II46],
and many more. For example, the facial recognition algorithm Eigenfaces
[1HJGIH1IE2] has been extended to TensorFaces [7T]. To give another example,
experiments have shown that fluorescence (i.e., the emission of light from a
substance) is modeled well using tensors, as the data follow a trilinear model [64].

A common thread in these applications is the need to manipulate the data, 
usually by compression, by taking advantage of its multidimensional structure 
(see, for example, the recent article [5T]). Collapsing multiway data to matrices
and using standard linear algebra to answer questions about the data often has 
undesirable consequences. It is thus important to consider the multiway data 
directly. 

Here we provide a brief overview of two types of higher-order extensions
of the matrix SVD. For more information, see the extensive article on tensor
decompositions [36] and references therein. Recall from (2) that the SVD is a
rank-revealing decomposition. The outer product $u_i v_i^T$ in equation (2) is often
written using the notation $u_i \circ v_i$. Just as the outer product of two vectors
is a rank-1 matrix, the outer product of three vectors is a rank-1 third-order
tensor. For example, if $x \in \mathbb{R}^{n_1}$, $y \in \mathbb{R}^{n_2}$, and $z \in \mathbb{R}^{n_3}$, then the outer product
$x \circ y \circ z$ has dimension $n_1 \times n_2 \times n_3$ and is a rank-1 third-order tensor whose
$(i, j, k)$th entry is given by $x_i y_j z_k$. Likewise, an outer product of four vectors
gives a rank-1 fourth-order tensor, etc. For the rest of this discussion, we will
limit our exposition to third-order tensors, but the concepts generalize easily to
order-$p$ tensors.
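
As a quick illustration of the outer product of three vectors, the following sketch (with NumPy and small vectors of our own choosing) forms $x \circ y \circ z$ and flattens it along each index to compute the matrix rank of each flattening.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0, 5.0])
z = np.array([6.0, 7.0, 8.0, 9.0])

# T[i, j, k] = x[i] * y[j] * z[k], so T has dimension 2 x 3 x 4.
T = np.einsum("i,j,k->ijk", x, y, z)

# Flatten T along each index and record the matrix rank of the result.
unfold_ranks = [
    np.linalg.matrix_rank(np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1))
    for mode in range(3)
]
```

Each flattening of a nonzero rank-1 tensor is a rank-1 matrix, which is one simple way to see the analogy with the matrix case.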

The tensor rank $r$ of an order-$p$ tensor $\mathcal{A}$ is the minimum number of rank-1
tensors that are needed to express the tensor. For a third-order tensor
$\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, this implies the representation

$$\mathcal{A} = \sum_{i=1}^{r} \sigma_i (u_i \circ v_i \circ w_i), \qquad (8)$$

where $\sigma_i$ is a scaling constant. The scaling constants are the nonzero elements
of an $r \times r \times r$ diagonal tensor $\mathcal{S} = (\sigma_{ijk})$. (A tensor is
called diagonal if the only nonzero entries occur in elements $\sigma_{ijk}$ with $i = j = k$.)
The vectors $u_i$, $v_i$, and $w_i$ are the $i$th columns from matrices $U \in \mathbb{R}^{n_1 \times r}$,
$V \in \mathbb{R}^{n_2 \times r}$, and $W \in \mathbb{R}^{n_3 \times r}$, respectively.
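
The sum of rank-1 terms described above is easy to assemble numerically. In the following sketch, the helper name `cp_to_tensor`, the dimensions, and the random factor matrices are assumptions that we make for illustration. Note that the sketch only builds a tensor from given factors; it does not compute a decomposition of a given tensor, which is a much harder problem.

```python
import numpy as np

def cp_to_tensor(sigma, U, V, W):
    """Assemble sum_i sigma_i (u_i o v_i o w_i) from the columns of U, V, and W."""
    return np.einsum("r,ir,jr,kr->ijk", sigma, U, V, W)

rng = np.random.default_rng(6)
n1, n2, n3, r = 4, 5, 6, 3
sigma = np.array([3.0, 2.0, 1.0])              # scaling constants
U = rng.standard_normal((n1, r))               # factor matrices (not orthogonal)
V = rng.standard_normal((n2, r))
W = rng.standard_normal((n3, r))

T = cp_to_tensor(sigma, U, V, W)

# The same tensor, built term by term as a sum of r rank-1 tensors.
T_loop = sum(
    sigma[i] * np.einsum("i,j,k->ijk", U[:, i], V[:, i], W[:, i])
    for i in range(r)
)
```

By construction, `T` has tensor rank at most `r`. For generic factors such as these, its flattenings have matrix rank `r`.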

One can think of equation (8) as an extension of the matrix SVD. Note,
however, the following differences.

1. The matrices $U$, $V$, and $W$ in (8) are not constrained to be orthogonal.
Furthermore, an orthogonal decomposition of the form (8) does not exist,
except in very special cases [213].

2. The maximum possible rank of a tensor is not given directly from the
dimensions, as is the case with matrices.$^5$ However, loose upper bounds on
rank do exist for higher-order tensors. Specifically, the maximum possible
rank of an $n_1 \times n_2 \times n_3$ tensor is bounded by $\min(n_1 n_2, n_1 n_3, n_2 n_3)$ in
general [35] and $\lfloor 3n/2 \rfloor$ in the case of $n \times n \times 2$ tensors [HUSSJEHIHS]. In
practice, however, the rank is typically much less than these upper bounds.
For example, Ref. [16] conjectures that the rank of a particular $9 \times 9 \times 9$
tensor is 19 or 20.

3. Recall that the best rank-$k$ approximation to a matrix is given by the $k$th
partial sum in the SVD expansion (Theorem 2). However, this result does
not extend to higher-order tensors. In fact, the best rank-$k$ approximation
to a tensor might not even exist [T9l[52].

4. There is no known closed-form solution to determine the rank $r$ of a tensor
a priori; in fact, the problem is NP-hard [29]. Rank determination of a
tensor is a widely-studied problem [36].

5 The maximum possible rank of an $n_1 \times n_2$ matrix is $\min(n_1, n_2)$.

In light of these major differences, there exists more than one higher-order
version of the matrix SVD. The different available decompositions are
motivated by the application areas. A decomposition of the form (8) is called a
CANDECOMP-PARAFAC (CP) decomposition (CANonical DECOMPosition
or PARAllel FACtors model) Q21[55], whether or not $r$ is known to be minimal.
However, since an orthogonal decomposition of the form (8) does not always
exist, a Tucker3 form is often used to guarantee the existence of an orthogonal
decomposition as well as to better model certain data [5tJll6"TllTlTl73|.

If $\mathcal{A}$ is an $n_1 \times n_2 \times n_3$ tensor, then its Tucker3 decomposition has the
form [57]

$$\mathcal{A} = \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \sum_{k=1}^{m_3} \sigma_{ijk} (u_i \circ v_j \circ w_k), \qquad (9)$$

where $u_i$, $v_j$, and $w_k$ are the $i$th, $j$th, and $k$th columns of the matrices
$U \in \mathbb{R}^{n_1 \times m_1}$, $V \in \mathbb{R}^{n_2 \times m_2}$, and $W \in \mathbb{R}^{n_3 \times m_3}$. Often, $U$, $V$, and $W$ have orthonormal
columns. The tensor $\mathcal{S} = (\sigma_{ijk}) \in \mathbb{R}^{m_1 \times m_2 \times m_3}$ is called the core tensor. In
general, the core tensor $\mathcal{S}$ is dense and the decomposition (9) does not reveal its
rank. Equation (9) has also been called the higher-order SVD (HOSVD) [18],
though the term "HOSVD" actually refers to a method for computation [36].
Reference [18] demonstrates that the HOSVD is a convincing extension of the
matrix SVD. The HOSVD is guaranteed to exist, and it computes (9) directly by
calculating the SVDs of the three matrices obtained by "flattening" the tensor
into matrices in each dimension and then using those results to assemble the
core tensor. Yet another extension of the matrix SVD factors a tensor as a
product of tensors rather than as an outer product of vectors [33][33].
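
The flattening procedure just described can be sketched compactly. The helper names `unfold` and `hosvd`, the particular ordering of columns in each flattening, and the random test tensor are our own assumptions; the sketch follows the recipe of taking the left singular vectors of each flattening and then assembling the core tensor.

```python
import numpy as np

def unfold(T, mode):
    """Flatten a tensor into a matrix whose rows are indexed by the given mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T):
    """Return the core tensor S and the factors U, V, W of a third-order tensor T."""
    U, V, W = [
        np.linalg.svd(unfold(T, mode), full_matrices=False)[0]
        for mode in range(3)
    ]
    # Core tensor: apply U^T, V^T, and W^T along the three indices of T.
    S = np.einsum("ijk,ia,jb,kc->abc", T, U, V, W)
    return S, U, V, W

rng = np.random.default_rng(7)
T = rng.standard_normal((3, 4, 5))
S, U, V, W = hosvd(T)

# Reassemble the tensor from the core tensor and the factors, as in the Tucker3 form.
T_rebuilt = np.einsum("abc,ia,jb,kc->ijk", S, U, V, W)
```

In this example the factors are square orthogonal matrices, so `T_rebuilt` reproduces `T` to rounding error, and the core tensor `S` is dense rather than diagonal.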



7 Everywhere You Go, Always Take the SVD With You.



The SVD is a fascinating, fundamental object of study that has provided a
great deal of insight into a diverse array of problems, which range from social
network analysis and quantum information theory to applications in geology.
The matrix SVD has also served as the foundation from which to conduct data
analysis of multiway data by using its higher-dimensional tensor versions. The
abundance of workshops, conference talks, and journal papers in the past decade
on multilinear algebra and tensors also demonstrates the explosive growth of
applications for tensors and tensor SVDs. The SVD is an omnipresent factorization
in a plethora of application areas. We recommend it highly.